11 research outputs found

    FlatNJ: A novel network-based approach to visualize evolutionary and biogeographical relationships

    Split networks are a type of phylogenetic network that allow visualization of conflict in evolutionary data. We present a new method for constructing such networks called FlatNetJoining (FlatNJ). A key feature of FlatNJ is that it produces networks that can be drawn in the plane in which labels may appear inside of the network. For complex data sets that involve, for example, non-neutral molecular markers, this can allow additional detail to be visualized as compared to previous methods such as split decomposition and NeighborNet. We illustrate the application of FlatNJ by applying it to whole HIV genome sequences, where recombination has taken place, fluorescent proteins in corals, where ancestral sequences are present, and mitochondrial DNA sequences from gall wasps, where biogeographical relationships are of interest. We find that the networks generated by FlatNJ can facilitate the study of genetic variation in the underlying molecular sequence data and, in particular, may help to investigate processes such as intra-locus recombination. FlatNJ has been implemented in Java and is freely available at www.uea.ac.uk/computing/software/flatnj

    Flat Embeddings of Genetic and Distance Data

    The idea of displaying data in the plane is very attractive in many different fields of research. This thesis focuses on distance-based phylogenetics and multidimensional scaling (MDS). Both types of method can be viewed as reducing high-dimensional data to pairwise distances and then visualizing the data based on these distances. The difference between the two is that phylogenetics aims to find a network or tree structure that fits the distances, whereas MDS does not fix any structure: objects are simply placed in a low-dimensional space so that distances in the solution fit distances in the input as well as possible. Chapter 1 provides an introduction to phylogenetics and multidimensional scaling. Chapter 2 focuses on the theoretical background of flat split systems (planar split networks). We prove equivalences between flat split systems, planar split networks and loop-free acyclic oriented matroids of rank three. The latter is a convenient mathematical structure that we used to design the algorithm for computing planar split networks that is described in Chapter 3. We base our approach on the well-established agglomerative algorithms Neighbor-Joining and Neighbor-Net. In Chapter 4 we introduce multidimensional scaling and propose a new method for computing MDS plots that is based on the agglomerative approach and spring embeddings. Chapter 5 presents several case studies that we use to compare both of our methods with some classical agglomerative approaches in distance-based phylogenetics.

    SPECTRE: a Suite of PhylogEnetiC Tools for Reticulate Evolution

    Split-networks are a generalization of phylogenetic trees that have proven to be a powerful tool in phylogenetics. Various ways have been developed for computing such networks, including split-decomposition, NeighborNet, QNet and FlatNJ. Some of these approaches are implemented in the user-friendly SplitsTree software package. However, to give the user the option to adjust and extend these approaches and to facilitate their integration into analysis pipelines, there is a need for robust, open-source implementations of associated data structures and algorithms. Here we present SPECTRE, a readily available, open-source library of data structures written in Java, that comes complete with new implementations of several pre-published algorithms and a basic interactive graphical interface for visualizing planar split networks. SPECTRE also supports the use of longer running algorithms by providing command line interfaces, which can be executed on servers or in High Performance Computing (HPC) environments

    Critical Assessment of Metagenome Interpretation: A benchmark of metagenomics software

    In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from newly sequenced ~700 microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions.

    Additional file 1 of SILVA, RDP, Greengenes, NCBI and OTT — how do these taxonomies compare?

    Supplementary material. A PDF file containing supporting data for the figures and detailed visualizations of pairwise mappings. (PDF 197 kb)